A Fully-Lexicalized Probabilistic Model for Japanese Syntactic and Case Structure Analysis
نویسندگان
چکیده
We present an integrated probabilistic model for Japanese syntactic and case structure analysis. Syntactic and case structure are simultaneously analyzed based on wide-coverage case frames that are constructed from a huge raw corpus in an unsupervised manner. This model selects the syntactic and case structure that has the highest generative probability. We evaluate both syntactic structure and case structure. In particular, the experimental results for syntactic analysis on web sentences show that the proposed model significantly outperforms known syntactic analyzers.
منابع مشابه
Probabilistic Coordination Disambiguation in a Fully-Lexicalized Japanese Parser
This paper describes a probabilistic model for coordination disambiguation integrated into syntactic and case structure analysis. Our model probabilistically assesses the parallelism of a candidate coordinate structure using syntactic/semantic similarities and cooccurrence statistics. We integrate these probabilities into the framework of fully-lexicalized parsing based on largescale case frame...
متن کاملA Fully-Lexicalized Probabilistic Model for Japanese Zero Anaphora Resolution
This paper presents a probabilistic model for Japanese zero anaphora resolution. First, this model recognizes discourse entities and links all mentions to them. Zero pronouns are then detected by case structure analysis based on automatically constructed case frames. Their appropriate antecedents are selected from the entities with high salience scores, based on the case frames and several pref...
متن کاملBilexical Grammars and a Cubic-time Probabilistic Parser
Computational linguistics has a long tradition of lexicalized grammars, in which each grammatical rule is specialized for some individual word. The earliest lexicalized rules were word-specific subcategorization frames. It is now common to find fully lexicalized versions of many grammatical formalisms, such as context-free and tree-adjoining grammars [Schabes et al. 1988]. Other formalisms, suc...
متن کاملA model of syntactic disambiguation based on lexicalized grammars
This paper presents a new approach to syntactic disambiguation based on lexicalized grammars. While existing disambiguation models decompose the probability of parsing results into that of primitive dependencies of two words, our model selects the most probable parsing result from a set of candidates allowed by a lexicalized grammar. Since parsing results given by the lexicalized grammar cannot...
متن کاملFrom Linguistic Theory to Syntactic Analysis: Corpus-Oriented Grammar Development and Feature Forest Model
The goal of this thesis is to establish a system for the automatic syntactic analysis of real-world text. Syntactic analysis in this thesis denotes computation of in-depth syntactic structures that are grounded in syntactic theories like Head-Driven Phrase Structure Grammar (HPSG). Since syntactic structures provide essential components for computing meanings of natural language sentences, the ...
متن کامل